Abstract

Color consistency is crucial for both photo and commercial
printing applications. Dot gain tables are currently
updated sporadically, and between updates colors can shift
due to process drift in the press. The goal is to
dynamically control the dot gain table and developer
voltage to ensure more consistent color control while
minimizing waste and calibration measurements.
Treating the problem as a machine-learning prob- 
lem, we predict the dot gain table values given the 
current state of the machine, as expressed in the val- 
ues of nineteen sensor measurements. Our initial in- 
vestigation based on a preliminary dataset shows that 
linear regression methods can predict the dot gain 
values with acceptable accuracy. 



1 Introduction 

Color consistency is crucial for both photo and commercial
printing applications. Dot gain tables are
currently updated sporadically, and between updates 
colors can shift due to process drift in the press. The 
goal is to dynamically control the dot gain table and 
developer voltage to ensure more consistent color con- 
trol while minimizing waste and calibration measure- 
ments. 



Currently the dot gain table and developer voltage 
are controlled by sporadically printing special calibra- 
tion jobs that print test patterns that can be observed 
and whose characteristics can be measured by the 
press. The calibration process first prints one or more 
test patterns with 100% ink coverage that are used to 
find the proper developer voltage setting for each ink 
in order for the ink thickness at 100% coverage to be 
correct. Once the developer voltage is set, the actual 
ink thickness or optical density at 100% coverage is 
measured. Then the calibration process prints one 
or more sheets of test patterns with monochromatic 
swatches of uniform digital dot area to measure the 
physical dot area for each of the digital dot areas.

Broadly speaking there are two separate machine 
learning problems: (1) predict the developer voltage
and corresponding ink optical density at 100% cov- 
erage per ink given the current machine state, and 
(2) predict the dot gain table values for each digital 
dot area of interest for each ink given the current ma- 
chine state, developer voltage, and ink optical density 
at 100% coverage. There are a number of possible re- 
lated problems, such as predicting the dot gain table 
values for one screen, given the current dot gain val- 
ues from a second screen. A large number of machine 
learning regression algorithms are applicable to these 
problems. We evaluate the accuracy of three common 
methods: artificial Neural Networks (NN), Support 
Vector Machines (SVM), and linear regression. 



If a method is found to supply sufficiently accurate 
predictions, we can replace or augment the calibra- 
tion procedure with a prediction-based process that 
has much less impact on customer workflow and con- 
sumable usage. The minimal requirements for the 
Indigo press are that the absolute difference between 
the predicted dot area and physical dot area is less 
than 2 at least 67% of the time, and less than 5 at 
least 95% of the time, for all digital dot areas.
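The acceptance criterion above can be written as a small check. This is a minimal sketch: the function name and the use of plain lists are illustrative assumptions, not part of the press software.

```python
def meets_accuracy_requirement(predicted, measured):
    """Return True if the absolute dot area error is below 2 at least
    67% of the time and below 5 at least 95% of the time."""
    errors = [abs(p - m) for p, m in zip(predicted, measured)]
    n = len(errors)
    frac_under_2 = sum(e < 2 for e in errors) / n
    frac_under_5 = sum(e < 5 for e in errors) / n
    return frac_under_2 >= 0.67 and frac_under_5 >= 0.95
```

In practice the check would be applied separately to each digital dot area, since the requirement must hold for all of them.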

In order to achieve color accuracy and consistency
the dot area must be accurate. To ensure that the
requested dot area is printed, the HP Indigo press uses
a dot gain look up table (LUT) to map the digital
dot area to the actual printed dot area. To maintain
color accuracy, a calibration procedure is performed,
during which time the physical dot area is measured 
for each of various digital dot areas. This procedure 
consumes substrate, ink and time, which prevents fre- 
quent updates. Unfortunately, the dot gain table is 
only accurate at the time of measurement because 
the press is not static. Consequently, the dot size 
is not properly controlled and can fluctuate between 
measurements, potentially causing color consistency 
and image quality problems. 

The dot gain is defined in Equation 1. 

$$\text{dot gain} = \frac{\text{printed dot area}}{\text{digital dot area}} \tag{1}$$

Both the digital dot area and printed dot area are
expressed as a percentage of the area that is covered,
where 100 means that the whole area is covered with
ink. The dot gain table contains the printed dot area
value from Equation 1 for each digital dot area of
interest.
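As an illustration of how such a table might be used, the sketch below computes the dot gain of Equation 1 and inverts a table by linear interpolation to find the digital dot area that yields a requested printed dot area. The interpolation scheme, the function names, and the sample table values are assumptions made for this example; the text does not specify how the press interpolates between table entries.

```python
def dot_gain(printed_area, digital_area):
    """Equation 1: ratio of printed to digital dot area."""
    return printed_area / digital_area


def digital_area_for(target_printed, lut):
    """Find the digital dot area expected to print `target_printed`.

    `lut` is a list of (digital, printed) pairs sorted by digital dot
    area, with strictly increasing printed values (assumed here).
    """
    for (d0, p0), (d1, p1) in zip(lut, lut[1:]):
        if p0 <= target_printed <= p1:
            t = (target_printed - p0) / (p1 - p0)
            return d0 + t * (d1 - d0)
    raise ValueError("target outside the measured range")


# Hypothetical table entries, for illustration only.
example_lut = [(0, 0), (20, 30), (50, 65), (100, 100)]
```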

The calibration process uses an inline optical
densitometer to read the physical dot areas from a swatch
of uniform density in a single color. Given various
physical constraints, one may fit up to fifteen such
swatches on a single sheet. Since the presses can have
up to seven separations (inks), this implies that we
may measure up to two digital dot areas for each
separation in a single sheet.

As an alternative to the full calibration process, we 
might create a "fast calibration" process that mea- 
sures two points per color separation, and then uses 



the measured information and the machine state to
predict the rest of the dot gain lookup table values. 

We analyzed a dataset of dot gain LUTs collected
by HP Indigo. Our results for this dataset are promis- 
ing, and, in particular, are within the required lim- 
its. It is important, however, to keep in mind that 
this dataset is small by machine learning standards - 
approximately 130 samples for each screen and sepa- 
ration. 

2 The Printing Process 

Converting a digital signal to a physical dot on a piece 
of paper is an analog process that can be affected by 
any number of system elements and interactions. The 
process of image production consists of three stages 
(see Figure 1): 

1. Image generation - A latent image is created
on the PIP foil. The PIP foil includes
photoconductive material. When exposed to light, this
material becomes a conductor. The PIP is
negatively charged by the Scorotron assembly. A
laser beam originating from the Writing Head is
used to discharge specific areas on the PIP foil.
These discharged areas comprise the latent image.

2. Image development - During this stage the
latent image is developed by ink on the PIP. The
ElectroInk consists of small colored ink particles
that are electrically charged. The BID (Binary
Ink Development) units apply developed
ink onto the discharged areas that compose the
latent image on the PIP foil.

3. Image transfer - During this stage the developed
image is transferred from the PIP to the
Blanket that wraps the ITM. The image is then
transferred from the Blanket to the substrate.
The transfer of the developed image from the
PIP to the Blanket is achieved through electrical
and mechanical forces. The Blanket is positively
charged and is heated to about 100°C.
This raises the temperature of the ink film on the
Blanket, which causes the ink particles to swell and
to acquire a gelatin-like form. At this stage, the
developed image is transferred from the Blanket
to the substrate.

Figure 1: Indigo Press (labelled components: Writing Head, Scorotron, Blanket, BID)

Many key elements, such as the PIP foil and blanket,
are regularly replaced and each replacement part has
its own characteristics. Thus, it is likely that a full
dot gain table measurement will need to be taken
after each major part replacement. In addition, during
normal operation other parameters, such as
temperature, vary continuously.

For our purposes, the HP Indigo press has twenty- 
four observable parameters. 

Table 1 shows the list of observed parameters 
whose values are available to the dot gain table con- 
trol system. Some of the parameters are properties of 
physical devices, such as the blanket, PIP, ink batch, 
and corona.

According to HP Indigo, the most important parameters
(while using a constant substrate) are probably:
developer voltage, ink separation, ink density, ink
conductivity, blanket counter, screen, PIP counter, and
PIP vlight/vbackground. These are marked with an
asterisk in Table 1.

Table 1: Indigo Press Parameters

Type         Parameter               Min        Max
Ink          density*                1.69       2.36
             conductivity*           67         123
             temperature             28.83      31.6
             separation*
Imaging oil  temperature             18.46      21.03
             dirtiness               1.272      1.316
ITM          temperature             126.7      135.1
             blanket counter*        66         58207
PIP          PIP counter*            46         86334
             vlight/vbackground*     34         90
             background qualifier    -43        149
Process      developer voltage*      -494.9     -345.7
             vcorona                 -6092      -5858
             icorona                 1.91       2.33
             vgrid                   -887.2     -540.6
             igrid                   -1.55      -1.07
             impression counter      1285003    1376710
             time and date
             corona age
             optical density 100%    1.05       1.8
             machine temperature     22.25      28.24
Other        screen*

HP Restricted

Figure 2: Histogram of Developer Voltage Values

During normal operation many of the parameters, 
such as the temperatures, are relatively stable. Some 
parameters, such as the various counters, change con- 
tinuously while other parameters, such as the devel- 
oper voltage, are used to control aspects of the print- 
ing process and generally vary within a standard op- 
erating range. 

The developer voltage for this dataset (collected
from a Series 1 machine) is adjusted in steps of 8
volts, although the final recorded voltage has some
noise. A histogram of the total developer voltage
observations separated into 8-volt bins is given in Figure
2. The distribution of the developer voltage values in
this dataset appears to be bimodal, with the main
mode occurring at -456, and a secondary mode at
about -360. The extreme values in the bin centered
on -488 and the bin centered on -344 are very
under-represented, as are the bins -384 and -376, between
the two modes.



3 Machine learning methods 

We used three methods for predicting the dot gain ta- 
bles: linear regression, neural networks, and support 
vector machines. Both neural networks and support 
vector machines fit non-linear multivariate functions 
to the training data. The fitting and analysis for each
method was done using the R statistical package, a 
GNU software platform that largely re-implements 
the commercial package S-Plus. 

3.1 Support vector machines 

Support vector machines are a kernel-based approach
to machine learning. A good tutorial introduction to
SVM was written by Burges [1]. Other standard
references on SVM include Vapnik [2] and Cristianini [3].
From Platt [4], the definition of the quadratic
programming problem for support vector learning
is shown in Equation 2:

$$
\begin{aligned}
\max_{\alpha}\; W(\alpha) &= \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} y_i y_j k(x_i, x_j)\, \alpha_i \alpha_j, \\
& 0 \le \alpha_i \le C, \quad \forall i, \\
& \sum_{i=1}^{l} y_i \alpha_i = 0,
\end{aligned}
\tag{2}
$$

where $l$ is the number of $(x, y)$ samples, $k(x_i, x_j)$ is
the kernel function of two sample input vectors $x_i$
and $x_j$, $y_i$ and $y_j$ are the corresponding sample values,
$C$ is a given parameter, and the $\alpha_i$ are being optimized
by the training process.

The quadratic programming problem is solved if
and only if the Karush-Kuhn-Tucker (KKT) conditions
are fulfilled and $Q_{ij} = y_i y_j k(x_i, x_j)$ is positive
semi-definite.

We used the radial basis function (RBF) kernel,
where $k(x_i, x_j) = e^{-\gamma \|x_i - x_j\|^2}$. To train an SVM
with the RBF kernel, one must select two
meta-parameter values: $C$ and $\gamma$. We use a
design-of-experiments (DOE) based method with
cross-validation error estimation to select the best
parameter settings for each problem, as described in
Staelin [5]. The SVM models are trained using the
libsvm library [6], with an R interface supplied by the
e1071 package.

Figure 3: Feedforward NN Architecture
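For concreteness, the RBF kernel can be evaluated as in the sketch below. It is written in plain Python for illustration only and is not the libsvm implementation; the function names are assumptions of this example.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(samples, gamma):
    """Kernel value for every pair of sample input vectors."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]
```

A larger gamma makes the kernel fall off faster with distance, which is why gamma has to be tuned together with C.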

3.2 Artificial neural networks 

Neural networks are a well-known technique for
machine learning. A good introduction and description
can be found in Bishop [7]. We used the neural
network package nnet from R. The nnet package
uses a standard feed-forward neural network
architecture with a single hidden layer and logistic activation
functions. The networks are fitted using BFGS
quasi-Newton optimisation, with the gradients supplied by
backpropagation.

In this architecture, each input node is connected
to each of the hidden nodes, and we allow linear
(skip) connections between the input nodes and the
output node. The output node is set to have linear
activation. Each hidden node is connected to the
output node, as in Figure 3. We represent the
connection weight between node $i$ and node $j$ by $w_{ij}$.
Each network node is indexed: index 0 is a bias
input with constant value 1, indices $1, \ldots, N_{in}$ are the
input nodes, indices $(N_{in} + 1), \ldots, (N_{in} + N_{hid})$ are
the hidden nodes, and $N_{in} + N_{hid} + 1$ is the output
node. If no connection exists between nodes $i$ and $j$,
then $w_{ij}$ is fixed as a constant 0.

Let $x_i$ be the input to any node $i$ (input, hidden
or output), and $z_i$ the output from that node. The
input and output for the input nodes are identical.
The input and output functions for hidden node $i$ in
terms of previous hidden nodes and inputs are shown
in Equations 3 and 4 respectively. Since we use linear
output nodes, the output $z_i$ for output node
$i = N_{in} + N_{hid} + 1$ is just $x_i$.

$$x_i = \sum_{j=0}^{i-1} w_{ij} z_j \tag{3}$$

$$z_i = \frac{1}{1 + e^{-x_i}} \tag{4}$$

For our experiments we used 2 hidden nodes, and
fitted the networks using weight decay regularization
(see [7]) with decay parameter 0.001.
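The node indexing described above translates into a short forward pass. The sketch below is an illustration and not the nnet implementation: the nested-dictionary weight layout and the function name are assumptions, and any weight that is not listed is treated as the constant 0 used for missing connections.

```python
import math


def forward(inputs, n_hidden, w):
    """Evaluate the network output for one input vector.

    w[i][j] is the weight on the connection from node j into node i.
    Node 0 is the bias, nodes 1..N_in are inputs, the next n_hidden
    nodes are hidden, and the last node is the linear output.
    """
    n_in = len(inputs)
    out_index = n_in + n_hidden + 1
    z = [1.0] + list(inputs)  # bias output, then input outputs
    for i in range(n_in + 1, out_index + 1):
        x_i = sum(w.get(i, {}).get(j, 0.0) * z[j] for j in range(i))
        if i < out_index:
            z.append(1.0 / (1.0 + math.exp(-x_i)))  # logistic hidden node
        else:
            z.append(x_i)  # linear output node
    return z[out_index]
```

With two inputs and two hidden nodes, node 5 is the output, and entries such as `w[5][1]` hold the skip connections from the inputs.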

4 Results 

Our analysis is based on a dataset collected by HP
Indigo in early February over a one week period on a
single Series 1 Indigo Press by a single operator using
the automatic calibration process. The dataset
contains 269 dot gain tables, each containing fifteen (15)
physical dot area values for each of the four inks, and
labelled with all the parameters appearing in Table 1.
There are measurements for two screens: 136 tables
for HDI-175, and 133 tables for Sequin.

4.1 Dot gain machine learning problems

The first and most important question was whether it 
is possible to predict the dot gain lookup table values 
with sufficient accuracy just given the current state of 
the machine, including the measured developer volt- 
age and OD100 parameters. 

Given the current state of the machine and a
screen, we want to predict the physical dot area for
each of n separations and m digital dot areas, in order
to fully populate the dot gain lookup tables. We can
formulate the problem in several ways. One possibility
is to create a single monolithic machine-learning
problem where the state of the machine, the selected
screen, the separation, and digital dot area are all
included as inputs, and the physical dot area is the
output. At the other extreme, one may create separate
machine learning problems for each combination
of screen, separation, and digital dot area, with the
machine state as input and the physical dot area as
output. There are a variety of intermediate
formulations that create separate machine learning problems
incorporating the screen, separation, and digital dot
area as inputs. These formulations trade off problem
complexity with the number of models trained.

We attempted several different intermediate problem
formulations. In the first, we solve separately
for each color separation, digital dot size, and screen,
giving n × m × s distinct models or functions, where n
is the number of color separations, m is the number
of sampled digital dot sizes, and s is the number of
screens. The second is to solve separately for each color
separation and screen, resulting in n × s models. Results
from tests using the second type of problem were not
promising, so we report only on tests using the first
formulation.

Using "fast calibration" for two physical dot area 
measurements, we can pose an additional learning 
problem where the inputs include the device param- 
eters as well as the two measured physical dot areas 
and the output is the dot gain area. For this problem, 
we again have a choice of the problem formulations 
above, and we selected the formulation that worked 
best in the first set of tests. We also have a number 
of choices with respect to the selection of points to 
include as the physical dot area inputs. We can select 
to use one or two points as input, and each of these 
points can be selected from the fifteen dot areas in 
the LUT. We tested all possible one point inputs and
promising combinations of two point inputs, but re- 
port results only for one combination per separation. 

We looked at a few of the possible problem for- 
mulations and found that the best solution was to 
have a separate machine learning problem for each 
screen, separation, and digital dot area. Combining 
problems always reduced the system accuracy, so we 
report results for this problem formulation only. For 



a single screen using four inks with fifteen digital dot 
areas this results in sixty independent machine learn- 
ing problems. 

We use 10-fold cross validation to evaluate the ex- 
pected prediction error. Note that SVM repeats its 
parameter search algorithm for each fold, so the test
data in each fold is not included in the training data 
used by the parameter search. 

The prediction errors are the difference between
the true printed dot area and the predicted printed dot
area. The mean error for all tests is very close to
zero, indicating that the predictions are unbiased. We
analyzed the prediction errors using a Chi-squared
goodness of fit test and found that they are
approximately normally distributed. Therefore, we can use
the standard methods for computing the confidence
intervals.
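The evaluation procedure can be sketched as follows for the linear regression case. This is an illustration in Python with NumPy rather than the R code used for the study; the function names, the random fold assignment, and the normal-theory interval are assumptions consistent with the description above.

```python
from statistics import NormalDist

import numpy as np


def cross_val_errors(X, y, n_folds=10, seed=0):
    """Out-of-fold prediction errors of a least-squares linear model."""
    order = np.random.default_rng(seed).permutation(len(y))
    A = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    errors = np.empty(len(y))
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        coef, *_ = np.linalg.lstsq(A[train_idx], y[train_idx], rcond=None)
        errors[test_idx] = y[test_idx] - A[test_idx] @ coef
    return errors


def interval_half_width(errors, level):
    """Half-width of a normal-theory interval covering `level` of errors."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * float(np.std(errors, ddof=1))
```

Calling `interval_half_width(errors, 0.67)` and `interval_half_width(errors, 0.95)` gives the two quantities that are compared against the limits of 2 and 5.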

The graphs in Figure 4 show the 67% prediction
error confidence intervals for the 175 lpi HDI-175 screen
for each of the three machine learning methods:
linear regression, neural networks, and support vector
machines, as a function of the digital dot area. Note
that each ink behaves slightly differently.

Figure 5 shows the analogous graphs for the lower
resolution 120 lpi Sequin screen.

From the results in Figures 4 and 5, it is apparent
that the behavior of all methods is similar. This
means that which points are "hard" to predict is
independent of the learning method. Since linear regression
performs comparably to the more complex non-linear
methods, all further analysis was done using linear
regression.

Figure 6 shows the 95% confidence intervals for 
each ink for both the HDI-175 and Sequin screens. In 
all cases, the 95% confidence interval is smaller than 
4, which is better than the minimal requirement of 
a 95% confidence interval of 5. The 67% confidence 
intervals can be seen from Figures 4 and 5, and they 
are always less than 2. Clearly, the requirement of at 
least 67% of errors less than 2 is also met. 

Figure 4: Prediction Error 67% Confidence Interval, HDI-175 Screen

Figure 5: Prediction Error 67% Confidence Interval, Sequin Screen

Figure 6: Prediction Error 95% Confidence Intervals for Linear Regression

4.2 Dot gain prediction using two measured points

Next we analysed the impact that adding two measured
points might have on prediction accuracy. The first
stage was deciding which two points to add. After
an evaluation process it was found that measuring
the physical dot area at digital dot areas 23 and 40
yielded the best performance improvement for
HDI-175, and digital dot areas 16 and 40 yielded the best
improvement for Sequin.

Figure 7 shows the 67% and 95% prediction
confidence intervals for HDI-175 and Sequin with two
measured points at 23 and 40 for HDI-175 and 16 and 40
for Sequin. In all cases, the 67% and 95% confidence
intervals are reduced, sometimes markedly. The
process of obtaining the measurements is not ideal, as
it requires the operator to interact with the press to
get the measurements. However, a full calibration
requires many sheets of paper (e.g. 15 sheets for 7 inks)
to measure all points, while a single sheet is required
to collect the data for two points.

4.3 Dot gain variation over time 

In order to deploy dot gain prediction effectively, we 
need to know how often the tables should be updated.
If the tables are updated too frequently, then the 
color consistency can be reduced because the table 
values are changing much faster than the underlying 
physical process. If the tables are updated too infre- 
quently, then the press can drift out of control and 
color consistency is again reduced. The goal is to 



update the dot gain tables when it is likely that the 
press is drifting out of control. 

As a first step we must determine how fast the press 
drifts as a function of the number of impressions. To 
do this we look at the dataset as a time series and 
compare the measured dot gain values for each LUT 
with the dot gain values for each subsequent LUT 
and look at the resulting data as a function of LUT 
change versus the intervening impression count. 

Figure 8 shows how the LUT table entries vary
between measurements as a function of the number
of intervening impressions. More precisely, it shows
the standard deviation for changes in the LUT
values. Since LUTs were not taken at fixed intervals,
in terms of impressions, we binned the data into
500-impression buckets. Clearly the system can drift
fairly quickly, so updating the LUTs as frequently as
every thousand impressions would likely improve the
color consistency.
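The pairwise comparison and binning described above can be sketched as follows. The record layout (an impression count paired with a list of LUT values) and the function name are assumptions made for this example.

```python
from collections import defaultdict
from statistics import pstdev


def drift_by_impressions(records, bucket=500):
    """Standard deviation of LUT changes, grouped by impression gap.

    `records` is a list of (impression_count, lut_values) tuples
    sorted by impression count. Every LUT is compared with every
    subsequent LUT, and the entry-wise differences are pooled into
    buckets of `bucket` intervening impressions.
    """
    changes = defaultdict(list)
    for a in range(len(records)):
        for b in range(a + 1, len(records)):
            gap = records[b][0] - records[a][0]
            key = (gap // bucket) * bucket  # lower edge of the bucket
            pairs = zip(records[a][1], records[b][1])
            changes[key].extend(vb - va for va, vb in pairs)
    return {key: pstdev(vals) for key, vals in sorted(changes.items())}
```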

One caveat with the results in Figure 8 is that the
developer voltage is adjusted as the first step in the
calibration process, so between each measurement the
developer voltage is updated. During normal
operation, the developer voltage would not be updated so
frequently, so the actual variation in LUT values may
be smaller than this graph indicates.
Figure 7: Prediction Error Confidence Intervals for Linear Regression When Given Two Measured Points

Figure 8: Standard Deviation of Physical Dot Area Differences

4.4 Dot gain prediction parameter selection

The importance of a parameter may be measured by 
the effect of removing that parameter from the model.
If the predictive power of the model is unaffected (or 
even improved) we may conclude that the parameter 
is not significant. On the other hand, if the model 
has a significant degradation in performance without 
a particular parameter, we may conclude that this 
parameter is significant and should be retained. 

To implement this method we proceeded in the
following manner. Firstly, for each statistical learning
technique tested, model predictions were obtained for
the entire dataset using the whole set of predictors.
The predictions were obtained using the same 10-fold
cross validation technique described in [4]. That is,
for each of the 10 folds of the dataset, a model was fitted
to the remaining 9/10 of the data, and a prediction
made on the 1/10 excluded from the fitting. This
prediction is taken as the model prediction on that
portion of the dataset. The procedure is carried out
on each slice of the dataset until a prediction is
obtained on the whole dataset.

The sum of squared errors (SSE) of these
predictions was computed on each of the digital dot values
of the LUTs, where both screens and all separations
were included in this sum. Then similar predictions
and SSE computations were made for models fitted
excluding each of the input parameters in order. At
the end of this, we have a matrix with a row for each
input parameter, a single row for the model
including all parameters, and a column for each digital
dot area (DDA) value. The matrix entries are the
SSE values of the model prediction on the relevant
DDA, where the model used to predict excludes the
input parameter corresponding to that row. The
matrix rows resemble the sample in Table 2, where the
rows and columns stretch over the full range of input
parameters and DDA values respectively.
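A simplified sketch of how such a matrix could be built for linear regression is given below. It uses NumPy instead of R, treats each column of `Y` as one DDA target, and does not pool over screens and separations as the study did; the function names and data layout are assumptions of this example.

```python
import numpy as np


def cv_sse(X, y, n_folds=10, seed=0):
    """Sum of squared out-of-fold errors of a linear model."""
    order = np.random.default_rng(seed).permutation(len(y))
    A = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    sse = 0.0
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        coef, *_ = np.linalg.lstsq(A[train_idx], y[train_idx], rcond=None)
        sse += float(np.sum((y[test_idx] - A[test_idx] @ coef) ** 2))
    return sse


def sse_matrix(X, Y, names, seed=0):
    """Row 'all' uses every predictor; row '-name' omits that predictor."""
    n_targets = Y.shape[1]
    rows = {"all": [cv_sse(X, Y[:, k], seed=seed) for k in range(n_targets)]}
    for j, name in enumerate(names):
        X_drop = np.delete(X, j, axis=1)
        rows["-" + name] = [
            cv_sse(X_drop, Y[:, k], seed=seed) for k in range(n_targets)
        ]
    return rows
```

Calling `sse_matrix` with a different `seed` for each repetition yields the repeated SSE estimates needed for the confidence intervals.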

We repeat this experiment 20 times and from this 
we are able to estimate the mean and standard devi- 
ation of the SSE value for each of the table entries. 
Using these estimates we are able to generate confi- 
dence intervals for the differences of means between 
the original (full) model and each of the depleted 



models, for each DDA value. 
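One way to form such an interval from the repeated SSE estimates is sketched below. The text does not state which interval formula was used, so the unpooled standard error and the normal quantile are assumptions, as are the function names.

```python
from statistics import NormalDist, mean, stdev


def diff_of_means_ci(depleted_sse, full_sse, level=0.95):
    """Interval for mean(depleted SSE) minus mean(full SSE)."""
    diff = mean(depleted_sse) - mean(full_sse)
    se = (
        stdev(depleted_sse) ** 2 / len(depleted_sse)
        + stdev(full_sse) ** 2 / len(full_sse)
    ) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return diff - z * se, diff + z * se


def is_significant(interval):
    """A parameter is judged relevant when the interval excludes zero."""
    low, high = interval
    return not (low <= 0.0 <= high)
```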

Figure 9 summarizes these results. The error bars
give the 95% confidence interval for the mean of the
relevant depleted model SSE minus the mean of the
full model SSE. If the confidence interval does not
include the zero line in a particular case, then we
can conclude (at the 0.05 level) that the parameter
under consideration is relevant for the dot gain LUT
prediction. From the graphs in Figure 9, we obtain
the following categorization of the model parameters:

Significant parameters: blanket.counter, vlight,
background.qualifier, OD100, vcorona, vdeveloper,
itm.temp, oil.dirt, ink.conductivity, ink.density.

Not significant parameters: vgrid, igrid, icorona,
machine.temp, oil.temp, ink.temp.

These results are obtained by deleting one parameter
and recomputing, so they must be applied with some
caution. For example, there may be two highly
correlated parameters, in which case deleting one of them
would not significantly affect the model, but
deleting both may reduce the model accuracy. We shall
consider the order of parameter deletion and possible
correlations in the next section.

Note also that the significant parameters are
effective at different DDA values. The blanket.counter,
vlight and itm.temp variables are effective mostly at the
lower DDA values. OD100, background.qualifier and
ink.conductivity are effective in the mid range, and
vcorona, vdeveloper, oil.dirt and ink.density have the
biggest effect at the high range of DDA values. This
behavior does make it difficult to rank the significant
parameters, since the final ranking will depend on
which range of the dot gain LUT is considered more
important, and whether a comparatively small error
over a wide range of values is preferable to a large
error over a small range of values. In the following
ranking we assume that all DDA values are equally
significant, and the goal is to minimize the maximum
absolute error. The most important parameters are
ranked 1, the least important significant parameters
are ranked 4, and the insignificant parameters are
ranked 5.



Table 2: Sample SSE matrix for DDA model predictions

                        X2         X4         X6         X9         X12        X14
all                     825.3441   1499.899   1759.544   2342.148   1786.486   1855.435
-blanket.counter        935.4560   1729.036   2025.373   2707.947   1929.873   1917.133
-vlight                 857.9499   1463.702   1828.056   2375.569   1823.233   1838.971
-background.qualifier   862.4410   1518.113   1819.945   2480.031   1886.576   1845.397
-OD100                  843.5559   1497.256   1821.808   2363.896   1895.075   1996.277
-vgrid                  859.4362   1451.960   1773.247   2281.338   1766.356   1761.680
-igrid                  814.0722   1449.212   1758.089   2345.243   1768.262   1789.235
-vcorona                866.6734   1502.491   1826.697   2440.609   1808.491   1941.426
-icorona                814.5600   1436.776   1726.267   2271.425   1758.439   1797.152
-vdeveloper             843.0000   1500.752   1822.115   2440.976   1795.174   1886.793



Figure 9: Difference of means 95% confidence intervals for depleted versus full linear regression models.

Figure 10: Difference of means 95% confidence intervals for linear regression model depleted of all rank 5 parameters against full model

Figure 11: Difference of means 95% confidence interval for linear regression on full parameter set and final depleted parameter set




1. blanket.counter, vdeveloper

2. vcorona, itm.temp, ink.conductivity

3. ink.density, OD100

4. vlight, oil.dirt, background.qualifier

5. vgrid, igrid, icorona, machine.temp, oil.temp,
ink.temp

4.4.1 Assessment of Insignificant Parameters 

As stated above, the fact that deleting a single
parameter does not significantly affect the model
performance does not necessarily imply that deleting
all such parameters will not affect the model
performance. In fact, in the present case, deleting all the
rank 5 parameters in this model leads to the
difference of means plot in Figure 10.

Clearly deleting the whole set of rank 5 parameters
does affect the model performance. Therefore, to
properly understand the significance of these
parameters we need to find a way to delete them in some
sensible order and view the results. At each step we
shall perform the procedure detailed in the first
section for obtaining a difference of means confidence
interval for all the remaining rank 5 parameters. We
shall split the rank 5 parameter set into 2 groups (a
and b) in the following way. At step i the parameter
giving the smallest sum of absolute difference of
means is assigned to group a. Then confidence
interval plots are obtained for all parameters not currently
assigned to groups a or b. If these plots demonstrate
that a parameter is significant in the performance of
the model, we assign it to group b and continue until
there are no unassigned parameters left.

At the end of this procedure the following four
parameters were found to be insignificant:
machine.temp, igrid, oil.temp, ink.temp. The parameters
icorona and vgrid were found to be significant.
As a final sanity check we fit a new full model with
the whole parameter set, and a second model with the
reduced set of parameters (less the four parameters
deemed insignificant above). The confidence interval
plot for the difference of means based on this
experiment is seen in Figure 11. Clearly the reduced set
of parameters does not hinder the model prediction,
and in some cases may even improve the prediction,
suggesting the presence of possible overfitting.

4.5 Developer voltage prediction 

This is a preliminary report on the developer voltage
prediction problem. Measuring and setting the
developer voltage is a required precursor to the Indigo
press LUT calibration procedure. It is an iterative
process that consumes significant resources. We
propose a statistical learning approach to this problem,
with the intention of replacing the machine
procedure, or at least reducing the resource consumption by
supplying a "good" starting point.

As noted in the introduction, the procedure for setting
the developer voltage is an iterative procedure
that requires significant consumable resources. Since
our goal is to reduce both consumable wastage and
machine "down-time", we wish to either replace such
procedures or reduce the time and resources they
require. In the case of developer voltage prediction,
the set of measurable parameters available to us is
given in Table I.

Note that the optical density 100% is not available 
since this is measured as part of the developer voltage 
calibration procedure. 

We apply the same statistical learning methods
employed in LUT prediction to model the developer
voltage as a function of these quantities, that is,
linear regression, neural networks and support vector
machines. The developer voltage observations were
denoised before fitting models. That is, each observation
was allocated the value of the nearest 8-volt
increment. This helps prevent the model from fitting
noise artifacts.
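
As a minimal sketch of this denoising step, the function below snaps a voltage to the nearest multiple of 8 volts. It assumes the 8-volt grid is anchored at multiples of 8 (consistent with the class values -488 and -344 quoted later); the function name is illustrative only.

```python
def snap_to_increment(voltage: float, step: float = 8.0) -> float:
    """Round a voltage to the nearest multiple of `step` volts.

    Assumption: the grid is anchored at zero, so valid values are
    ..., -440, -432, -424, ... Exact ties follow Python's round(),
    which rounds half to the even multiple.
    """
    return step * round(voltage / step)
```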

The statistical learning problem associated with
the prediction of developer voltage is an ordinal
regression problem [8]. In this problem formulation, a
function of the predictors (i.e. the press parameters)
is sought that predicts the rank of the developer voltage
in the range of possible developer voltages. In the
case of this data, there are 19 classes, the lowest rank
(class 1) being -488 volts, and the highest rank (class
19) being -344 volts. There are known linear and
non-linear techniques for solving this sort of problem (see,
for example, [9]); however, they generally require that
all classes be well represented in the dataset. This is
not the case with the dataset under consideration.

Therefore, we treat the problem as a simple re- 
gression problem, and then take the class nearest 
the model prediction as the predicted developer volt- 
age. The fitted models interpolate in the under- 
represented regions and can therefore still provide 
predictions that should be sensible. 
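
The mapping between a continuous regression output and the 19 voltage classes can be sketched as follows. The class limits come from the text above; clipping out-of-range predictions to the end classes is an assumption made here for illustration, as are the function names.

```python
V_MIN, V_MAX, STEP = -488, -344, 8  # class 1 and class 19, 8-volt spacing


def voltage_to_class(voltage: float) -> int:
    """Return the rank (1..19) of the class nearest to `voltage`.

    Assumption: predictions outside [-488, -344] are clipped to the
    nearest end class.
    """
    clipped = min(max(voltage, V_MIN), V_MAX)
    return int(round((clipped - V_MIN) / STEP)) + 1


def class_to_voltage(rank: int) -> int:
    """Return the developer voltage associated with a class rank."""
    return V_MIN + STEP * (rank - 1)
```

For example, a raw model output of -430.2 volts maps to class 8, that is, -432 volts.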



Figure 12: Developer Voltage for HDI-175 and Black Ink

We split the dataset into eight subsets corresponding
to the two represented screens (Sequin and HDI-175)
and the four represented separations for each
screen (Black, Cyan, Magenta and Yellow). Figure
12 shows an example of the developer voltage values
for a single screen and separation (in this case,
HDI-175 and Black, respectively). For each of these
subsets we apply 10-fold cross validation to fit models
and obtain an independent prediction on each element
in the dataset. The cross validation folds are
taken from 10 randomly selected subsets that span
the subset under consideration.
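
A minimal sketch of how 10-fold cross validation yields one independent prediction per observation is given below. Ordinary least squares (via NumPy) is used as the model purely for illustration; the function name, the seed and the fold assignment scheme are assumptions, not the code used in this study.

```python
import numpy as np


def out_of_fold_predictions(X, y, n_folds=10, seed=0):
    """Predict every observation from a model that never saw it."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    rng = np.random.default_rng(seed)
    # Randomly partition the indices into n_folds disjoint subsets.
    folds = np.array_split(rng.permutation(n), n_folds)
    A = np.column_stack([np.ones(n), X])  # intercept column plus predictors
    preds = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Fit on the other nine folds, predict the held-out fold.
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        preds[test_idx] = A[test_idx] @ coef
    return preds
```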

Using the above problem formulation we obtained
developer voltage predictions on the given dataset for
each of the statistical learning methods: linear regression,
neural networks and support vector machines.
The linear regression models included a stepwise
parameter selection based on the AIC (see, for example,
[10]). The neural network models had a single hidden
layer with 5 hidden nodes and a non-linear output
node. The support vector machine models used
a radial basis function kernel, with hyper-parameters
set by the parameter selection algorithm described in
Staelin [5].
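
To illustrate stepwise selection by AIC, the sketch below performs backward elimination on a least squares model, using the Gaussian form AIC = n ln(RSS/n) + 2k (up to an additive constant). The direction of the search and the function names are assumptions; the text does not say which stepwise variant or software was used.

```python
import numpy as np


def aic(X, y):
    """AIC of an ordinary least squares fit with intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    return n * np.log(rss / n) + 2 * A.shape[1]


def backward_stepwise(X, y):
    """Drop one column at a time while doing so lowers the AIC."""
    keep = list(range(X.shape[1]))
    best = aic(X[:, keep], y)
    while keep:
        # AIC obtained by removing each remaining column in turn.
        trials = [(aic(X[:, [c for c in keep if c != j]], y), j) for j in keep]
        trial_aic, drop = min(trials)
        if trial_aic >= best:
            break  # no single deletion improves the criterion
        best = trial_aic
        keep.remove(drop)
    return keep  # indices of the retained columns
```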

As stated above, the resulting predictions are 

Figure 13: Prediction Error Histograms for Different Models

Figure 14: Prediction Errors for Neural Network Models on HDI-175 Screen

Figure 15: Prediction Errors for Neural Network Models on Sequin Screen



[Figure 15 panels: Neural Net Prediction Errors for Sequin Screen and Black Ink, Cyan Ink, Magenta Ink, and Yellow Ink]






Figure 16: Prediction Error Histograms for Different Models on Different Screens 

rounded to the nearest 8-volt class. We then consider
the discrepancy in number of classes between
the predicted and actual values. That is, if a prediction
of 432 is made when the actual class is 316, the
reported error is the number of 8-volt classes separating
the two values. The histograms of the resulting
prediction errors for each statistical learning method
are given in Figure 13. There is also a per-screen
breakdown of these graphs at the end of the article
(Figure 16). In all cases the predicted voltage class is
within 2 classes of the correct class for at least 99%
of the predictions. For the linear regression models,
the predicted class is within 1 class of the correct
class at least 90% of the time, and both the non-linear
methods (neural networks and SVM) achieve a
prediction within 1 class of the correct class at least
95% of the time, with neural networks slightly more
accurate. The separation breakdown for neural network
predictions and the HDI-175 screen is given in
Figure 14 (Sequin screen in Figure 15). Note that the
results are fairly consistent across separations.
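
The error measure and the "within k classes" percentages can be computed as in the sketch below. The sign convention (predicted minus actual) and the function names are assumptions, and the voltages in the usage note are invented for illustration rather than taken from the dataset.

```python
import numpy as np


def class_errors(predicted_v, actual_v, step=8):
    """Signed error, in classes, after rounding predictions to the grid."""
    predicted_v = np.asarray(predicted_v, dtype=float)
    actual_v = np.asarray(actual_v, dtype=float)
    snapped = step * np.round(predicted_v / step)  # nearest 8-volt class
    return np.rint((snapped - actual_v) / step).astype(int)


def fraction_within(errors, k):
    """Fraction of predictions at most k classes from the actual class."""
    return float(np.mean(np.abs(errors) <= k))
```

With invented predictions of -430, -401, -360 and -488 volts against actual classes of -432, -416, -360 and -480 volts, the errors are 0, 2, 0 and -1 classes, so three quarters of the predictions are within one class.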

4.6 Dot gain table prediction revisited

The results so far suggest that we can predict the
developer voltage reasonably accurately given the current
machine state (Section 4.5), and that given the
developer voltage and OD100 and the current machine
state we can predict the dot gain tables reasonably
accurately (Section 4.1). The next question
is whether we can predict both the developer voltage
and dot gain values given just the machine state.
Unfortunately, our results also suggest that OD100 is
an important parameter (Section 4.4).

Figure 17 shows the 67% and 95% prediction error
confidence intervals for both HDI-175 and Sequin
screens using neural networks for non-linear regression.
By comparing these graphs with those in Figure 6
we can see a noticeable deterioration in the
predictions at the 95% level. This is not surprising
since we have omitted two parameters that are known
to be significant from the model. Nonetheless, the
plots suggest that we can predict the dot gain LUT
values to an accuracy close to Indigo's acceptance
criteria even without these quantities.

Of course, the value of such an observation depends
entirely on the usage model. Presumably the developer
voltage needs to be calibrated from time to time
to provide an acceptable optical density 100%. As
demonstrated above, we can determine a suitable developer
voltage using a machine learning approach.
The machine would then be set to that value, and
the dot gain LUT prediction could be made including
developer voltage as a known parameter.

4.7 Cross-screen prediction

One of the variable quantities in the setup of the HP
Indigo Press is the printing screen. The machine operator
may exchange a printing screen for a different
screen during printer operation. Generally this process
requires recalibration of various aspects of the
machine, one of which is the dot gain LUT. Supposing
the machine state (temperatures, ink characteristics,
etc.) does not change significantly during the
screen exchange operation, and further that the dot
gain LUT being used by the machine for the previous
screen is "current" in some sense, then we may hope
to discover a mapping between the dot gain LUTs
for the two screens that will save some or all of the
manual LUT calibration. Using the existing dot gain
LUT dataset, which contains roughly equal numbers of
LUT values for the Sequin and HDI-175 screens, we were
able to extract a subset of 84 paired LUT measurements,
in which the LUT measurements for the two screens
correspond to approximately the same machine state.
Each measurement includes 4 separations, giving a
total of 336 paired LUT samples. Since the goal here
is to discover a mapping between a LUT of one screen
type and a LUT of the other screen type, we treated
all separations together. This investigation is only
preliminary, and certainly will require further experimentation
on a larger dataset to verify the results.
Figure 18 gives a plot of the printed dot area (PDA)
values for one screen against the other, where each
subplot corresponds to the digital dot area (DDA)
given in the plot title. The apparent structure in
these plots does suggest there is some relationship
between the two screen LUTs, although it is quite
weak in some cases.

In this investigation we shall only consider neural 
network models and attempt to predict the PDA for 
each DDA on one screen based on the whole LUT for 



Figure 17: 67% and 95% Confidence Intervals with Neural Network Prediction

[Figure 17 panels: 67% and 95% confidence intervals for neural network prediction with the HDI-175 and Sequin screens]







Figure 18: Scatterplots of PDA values from the Sequin Screen against PDA values from the HDI-175 Screen 



the corresponding other screen. Thus to predict the
HDI-175 LUT (for example) from the Sequin LUT requires
training 17 networks on the learning dataset.
The same number of networks is required to perform
the reverse mapping. Once again, we are interested
in finding a direct mapping between LUTs from two
different screens, so we do not include any extra machine
state parameters in the models. We have previously
shown that given the machine state, one can
predict the dot gain LUT values to acceptable accuracy,
so it would not be surprising to find models
fitted using machine state parameters do correspondingly
well. In this investigation we shall use only the
dot gain LUT for one screen to predict the LUT for
the other. The method used is that of the previous
investigations, namely an independent prediction is
obtained on each point in the dataset using 10-fold
cross validation. We use this prediction to
assess the accuracy of the LUT fit.
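
The "one model per DDA" structure can be sketched as follows. Linear least squares is swapped in for the neural networks purely to keep the example short and self-contained, and the array layout (one row per paired sample, one column per LUT entry, 17 columns assumed) and function names are illustrative assumptions.

```python
import numpy as np


def fit_cross_screen(lut_src, lut_dst):
    """Fit one model per destination LUT entry from the whole source LUT."""
    n = lut_src.shape[0]
    A = np.column_stack([np.ones(n), lut_src])  # intercept + full source LUT
    models = []
    for j in range(lut_dst.shape[1]):  # one model per DDA
        coef, *_ = np.linalg.lstsq(A, lut_dst[:, j], rcond=None)
        models.append(coef)
    return models


def predict_cross_screen(models, lut_src):
    """Predict the destination LUT, one column per fitted model."""
    A = np.column_stack([np.ones(lut_src.shape[0]), lut_src])
    return np.column_stack([A @ coef for coef in models])
```

Swapping the roles of the two arrays gives the reverse mapping, which requires a second set of models of the same size.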

The 67% and 95% confidence intervals for both
comparisons (HDI->Sequin and Sequin->HDI) are
shown in Figure 19. In both cases the predictions fall
within an absolute difference of 2 at least 67% of the
time, and an absolute difference of about 4 at least
95% of the time. This meets Indigo's accuracy requirement
for LUT prediction.
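
One simple way to obtain bounds of this kind is to take empirical percentiles of the absolute prediction error, as sketched below. This is an assumption about how such intervals could be computed, not a description of the procedure actually used for Figure 19; the function name is illustrative.

```python
import numpy as np


def abs_error_bounds(predicted, actual, levels=(67, 95)):
    """Absolute error not exceeded by the given percentage of predictions."""
    err = np.abs(np.asarray(predicted, dtype=float)
                 - np.asarray(actual, dtype=float))
    return {level: float(np.percentile(err, level)) for level in levels}
```

A result such as {67: 2.0, 95: 4.0} would correspond to the statement that predictions fall within 2 at least 67% of the time and within 4 at least 95% of the time.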

5 Conclusions and future work 

From the initial dataset it appears that given the 
measurable parameters from Table I we can predict 
the various dot gain values with acceptable accuracy 
using linear regression. This should allow HP In- 
digo to greatly improve the color consistency for their 
presses, while reducing both the consumable waste 
and workflow disruption. 

We are surprised to see linear regression obtain
results equivalent to the non-linear learning methods
(neural networks and SVM). It seems counter-intuitive
that the model for this problem is a linear
one. We attribute the success of linear regression to
the fact that the dataset was relatively small and we
strongly suspect that given more data, the non-linear
methods will produce better results. In future we
plan to run more experiments using all three methods
as we collect more data.

The set of input parameters for dot gain
LUT prediction using a linear regression model
may be reduced to the following subset:
blanket.counter, vdeveloper, vcorona, itm.temp,
ink.conductivity, ink.density, OD100, vlight, oil.dirt,
background.qualifier, vgrid, icorona. That is, the
following four parameters were found to not significantly
affect the model performance: igrid,
machine.temp, oil.temp, ink.temp. It is difficult
to determine the relative importance of the input
parameters since single deletions do not account for
parameter interactions. Still, as a rough indicator
we would suggest the following initial ranking:

1. blanket.counter, vdeveloper

2. vcorona, itm.temp, ink.conductivity

3. ink.density, OD100

4. vlight, oil.dirt, background.qualifier

5. vgrid, icorona

This introductory study of the developer voltage prediction
problem suggests that we are able to predict
the developer voltage given the machine state parameters
with a high degree of accuracy. In particular,
if an error of at most one 8-volt step is acceptable,
then statistical learning methods can supply acceptable
predictions more than 95% of the time. On the
other hand, if an exact value is demanded, the models
investigated here can give a starting point that will
be accurate approximately 60% of the time, at most 1
step off approximately 95% of the time, and at most
2 steps off approximately 99% of the time. This is
likely to yield significant savings in consumables and
calibration time for users of the Indigo press.

These results are obtained on a relatively small 
dataset, operated with only 2 screens and a single 
substrate type. The conclusions obtained here should 
be verified on a much larger data sample, or alterna- 
tively as an experimental implementation on a func- 
tioning press. Future work, based on a larger dataset, 
may yield more accurate results as the more appro- 
priate ordinal regression models could be utilized. 

Figure 19: Prediction Error 95% Confidence Intervals for Cross-screen Prediction Using Neural Networks

[Figure 19 panels: 67% and 95% Confidence Intervals for HDI-175 and Sequin Cross Screen Predictions]



A number of questions need to be addressed before
this can be sent to customers' presses. For example,
we will need to evaluate the best update interval,
e.g. how often should the system update the dot gain
tables using model-based prediction with no printed
measurements? How often should we use the "fast
calibration" to get more accurate predictions? How
often do we need to do full calibrations? How often
should we update or refit the models to incorporate
information from full calibrations? Other questions
regard cross-machine measurement and prediction.
For example, are individual presses idiosyncratic, or
can we use measurements from one machine to predict
the behavior of another machine?

Since the first step in the calibration process in- 
volves an iterative process to set the developer volt- 
age, can we either predict the correct developer volt- 
age setting directly, or can we use the prediction to 
get more accurate initial conditions for the iterative 
search? 

Over the next several months we expect to prototype
a dot gain prediction system that is embedded
in a press. We can then begin conducting various
experiments to evaluate the actual improvement in
color consistency when compared to the current practice
of sporadic LUT calibrations.